We present the clustering properties of 151 Lyman-α emitting galaxies at z ≈ 4.5 selected
from the Large Area Lyman Alpha (LALA) survey. Our catalog covers an area of 36' × 36'
observed with five narrowband filters. We assume that the angular correlation function
$w(\theta)$ is well represented by a power law $A_w \theta^{-\beta}$ with slope $\beta = 0.8$,
and we find $A_w = 6.73 \pm 1.80$. We then calculate the correlation length $r_0$ of the
real-space two-point correlation function $\xi(r) = (r/r_0)^{-1.8}$ from $A_w$ through the
Limber transformation, assuming a flat, Λ-dominated universe. Neglecting contamination, we
find $r_0 = 3.20 \pm 0.42\,h^{-1}$ Mpc. Taking into account a possible 28% contamination by
randomly distributed sources, we find $r_0 = 4.61 \pm 0.6\,h^{-1}$ Mpc. We compare these
results with the expectations for the clustering of dark matter halos at this redshift in a
Cold Dark Matter model, and find that the measured clustering strength can be reproduced if
these objects reside in halos with a minimum mass of 1-2 $\times 10^{11}\,h^{-1} M_\odot$.
Our estimated correlation length implies a bias of b ~ 3.7, similar to that of Lyman-break
galaxies (LBG) at z ~ 3.8-4.9. However, Lyman-α emitters are a factor of ~ 2-16 rarer than
LBGs with a similar bias value and implied host halo mass. Therefore, one plausible scenario
seems to be that Lyman-α emitters occupy host halos of roughly the same mass as LBGs, but
shine with a relatively low duty cycle of 6-50%.

1. Introduction

Galaxy clustering provides a powerful tool for testing cosmological models and galaxy 
formation models, through quantitative comparisons between predicted and observed clus- 
tering statistics. The galaxy two-point correlation function is the most widely used of these 
statistics, thanks to its straightforward calculation and its direct relationship to the galaxy 
power spectrum. It has been established for decades that the two-point correlation function 
is reasonably well described by a power law over a range of distances between the observed 
galaxies (see pioneering works of Totsuji & Kihara 1969 and Peebles 1974). 

Large redshift surveys of galaxies, such as the two-degree Field Galaxy Redshift Survey 
(2dFGRS; Colless et al. 2001) and the Sloan Digital Sky Survey (SDSS; Loveday 2002) 
provide precise measurements of the clustering signal for redshift z ~ 0. Their size makes it 
possible to investigate the dependence of the clustering signal on intrinsic galaxy properties, 
such as morphology or luminosity. Red galaxies are clustered more strongly than blue
galaxies, and their correlation function follows a steeper power law (e.g. Norberg et al.
2002; Zehavi et al. 2002, 2004). This conclusion agrees with the results of surveys at
intermediate redshifts on the clustering of galaxies of different colors (Le Fèvre et al.
1996; Carlberg et al. 1997). These surveys also detect redshift evolution in galaxy
clustering. Recently, surveys have achieved sufficient size and uniformity to detect small
deviations of the measured correlation functions from pure power-law fits (Zehavi et al.
2004; Zheng 2004).

Identification of large high-redshift galaxy samples using multiband color selection tech- 
niques (Meier 1976; Madau et al. 1996; Steidel et al. 1996, 1998) has opened the way for 
studies of luminosity functions and correlation functions in the distant universe (Giavalisco 
et al. 1998; Adelberger et al. 1998; Adelberger et al. 2000; Ouchi et al. 2003; Shimasaku 
et al. 2003; Hamana et al. 2003; Brown et al. 2005; Allen et al. 2005; Lee et al. 2006). 
Galaxies selected in these broad band photometric surveys are expected to have broadly
similar properties and lie in a restricted redshift interval ($\Delta z \sim 1$).

Lyman-α emission offers an alternative method for finding high redshift galaxies. These
are typically star-forming galaxies with smaller bolometric luminosities than the usual
continuum-selected objects. These samples do not appear to contain substantial numbers of
active galactic nuclei (Malhotra et al. 2003; Wang et al. 2004; Dawson et al. 2004).

In the modern picture of galaxy formation, based on the Cold Dark Matter (CDM)
model, galaxies form in dark matter halos which evolve in a hierarchical manner. Here,
the clustering pattern of galaxies is determined by the spatial distribution of dark matter
halos and the manner in which dark matter halos are populated by galaxies (Benson et
al. 2000; Peacock & Smith 2000; Seljak 2000; Berlind & Weinberg 2002). Galaxy surveys
provide constraints on the galaxy distribution. The dark matter distribution is estimated
using N-body simulations or an analytical approach, generally based on the Press-Schechter
formalism (Press & Schechter 1974) and its extensions (Sheth et al. 2001; Sheth & Tormen
2002). The statistical relation between galaxies and the dark matter halos where they are
found can be described empirically using a "halo occupation function" (e.g. Moustakas &
Somerville 2002), which gives the probability that a halo of a given mass hosts $N$
galaxies.

In this article we describe the clustering properties of galaxies selected through their
Lyman-α emission at z ~ 4.5. In section 2 we present the data used in this paper and describe
the selection of the Lyman-α candidates. In section 3 we present the correlation function
analysis and results. We compare these results to the prediction of CDM theory in section 4.
A discussion and a summary of our main conclusions are given in section 5. For all
calculations we adopt a ΛCDM cosmology with $\Omega_M = 0.3$, $\Omega_\Lambda = 0.7$,
$H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$ and the power-spectrum normalization $\sigma_8 = 0.9$.
We scale our results to $h = H_0/(100$ km s$^{-1}$ Mpc$^{-1})$.

2. The LALA sample 

The Large Area Lyman Alpha (LALA) survey started in 1998 as a project to identify
a large sample of Lyα-emitting galaxies at high redshifts (Rhoads et al. 2000). Over 300
candidates have been identified so far at z ~ 4.5 (Malhotra & Rhoads 2002), with smaller
samples at z ≈ 5.7 (Rhoads & Malhotra 2001; Rhoads et al. 2003) and z ≈ 6.5 (Rhoads et
al. 2004). The search for Lyman-α emitters is carried out through narrowband imaging with
the wide-field Mosaic camera on the 4 m Mayall telescope at Kitt Peak National Observatory.
Two fields of 36' × 36' each are observed, covering a total area of 0.72 deg$^2$. In this
article we discuss the properties of the Lyman-α emitters at z ~ 4.5 selected from the
Boötes field, centered at 14h25m57s, +35°32' (2000.0). Full details about the survey and
data reduction are given in Rhoads et al. (2000) and Malhotra & Rhoads (2002). Five
overlapping narrowband filters of width (FWHM) 8 nm are used. The central wavelengths are
655.9, 661.1, 665.0, 669.2, and 673.0 nm, giving a total redshift coverage 4.37 < z < 4.57.
This translates into a surveyed volume of $7.3 \times 10^5$ comoving Mpc$^3$ per field
(Rhoads et al. 2000).
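
The quoted redshift coverage can be checked from the filter set alone. The following is a minimal sketch, assuming that the coverage runs from half a FWHM below the bluest central wavelength to half a FWHM above the reddest one, and taking 121.567 nm as the rest wavelength of Lyα; the function names are illustrative and not part of the survey pipeline.

```python
LYA_REST_NM = 121.567  # rest-frame Lyman-alpha wavelength in nm


def lya_redshift(observed_nm):
    """Redshift at which Lyman-alpha is observed at the given wavelength."""
    return observed_nm / LYA_REST_NM - 1.0


def redshift_coverage(centers_nm, fwhm_nm):
    """Redshift range spanned by a set of filters of equal FWHM.

    Assumption: each filter is treated as a top hat of width FWHM.
    """
    z_lo = lya_redshift(min(centers_nm) - fwhm_nm / 2.0)
    z_hi = lya_redshift(max(centers_nm) + fwhm_nm / 2.0)
    return z_lo, z_hi


centers = [655.9, 661.1, 665.0, 669.2, 673.0]  # nm, as listed in the text
z_lo, z_hi = redshift_coverage(centers, fwhm_nm=8.0)
print(f"{z_lo:.3f} < z < {z_hi:.3f}")
```

Under these assumptions the range comes out within about 0.01 of the interval 4.37 < z < 4.57 given above; the small difference at the blue end depends on how the filter edges are defined.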

Corresponding broadband images are obtained from the NOAO Deep Wide-Field Survey
(Jannuzi & Dey 1999) in a custom $B_W$ filter and the Johnson-Cousins R and I filters.
Candidates are selected using the following criteria. In narrowband images candidates have
to be 5σ detections, where σ is the locally estimated noise. The flux density in narrowband
images has to exceed that in the broadband images by a factor of two. This corresponds to
a minimum equivalent width (EW) of Lyα of 80 Å in the observer frame, which helps cut
down foreground emitters. Additionally, the narrowband flux density must exceed the broad
band flux density at the 4σ level or above. Finally, candidates that are detected in the
$B_W$ band image at > 2σ are rejected, as such blue flux should not be present if the
source is really at high redshift.
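
The four photometric criteria can be summarized as a single predicate. This is an illustrative sketch rather than the survey's selection code: the argument names are hypothetical, the significance of the narrowband excess is assumed to use the two noise estimates added in quadrature, and the equivalent-width relation assumes that the broadband flux density measures pure continuum.

```python
import math


def is_lya_candidate(nb_flux, nb_sigma, bb_flux, bb_sigma, bw_flux, bw_sigma):
    """Apply the four photometric cuts described in the text.

    All fluxes are flux densities in the same (arbitrary) units.
    """
    detected = nb_flux >= 5.0 * nb_sigma            # 5 sigma narrowband detection
    strong_line = nb_flux >= 2.0 * bb_flux          # narrowband / broadband >= 2
    # Assumption: the noise on the excess is the quadrature sum of both noises.
    excess_sigma = math.hypot(nb_sigma, bb_sigma)
    significant = (nb_flux - bb_flux) >= 4.0 * excess_sigma
    no_blue_flux = bw_flux <= 2.0 * bw_sigma        # reject > 2 sigma B_W detections
    return detected and strong_line and significant and no_blue_flux


def observed_ew_angstrom(flux_ratio, filter_width_angstrom=80.0):
    """Observer-frame EW implied by a narrowband/broadband flux density ratio.

    Assumption: the broadband flux density is pure continuum.
    """
    return (flux_ratio - 1.0) * filter_width_angstrom
```

With a flux density ratio of two and an 80 Å wide filter, `observed_ew_angstrom(2.0)` gives 80 Å, which is the minimum equivalent width quoted above.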

These selection criteria were followed by visual inspection. In the overlapping area of all 
5 narrowband filters, we selected a total of 151 candidate Lyα emitters. More information
about the sample is summarized in Table 1, where we give the number of candidates as 
detected in each of the filters. Because the filters overlap in wavelength, many objects were 
selected in more than one filter. Thus, the total of the sample sizes for the five individual 
filters exceeds the size of the merged final sample. 



3. Two point correlation function 

3.1. The $w(\theta)$ estimation

The angular correlation function $w(\theta)$ is defined such that the probability of finding
two galaxies in two infinitesimal solid angle elements of size $\delta\Omega$, separated by
angle $\theta$, is $(1 + w(\theta))\,\Sigma^2\,\delta\Omega^2$, where $\Sigma$ is the mean
surface density of the population. Typically, $w(\theta)$ is measured by comparing the
observed number of galaxy pairs at a given separation $\theta$ to the number of pairs of
galaxies independently and uniformly distributed over the same geometry as the observed
field. A number of statistical estimators of $w(\theta)$ have been proposed (Landy
& Szalay 1993; Peebles 1980; Hamilton 1993).

We calculate the angular correlation function using the estimator proposed by
Landy & Szalay (1993),

$$ w(\theta) = \frac{DD(\theta) - 2DR(\theta) + RR(\theta)}{RR(\theta)} , \tag{1} $$

where $DD(\theta)$ is the number of pairs of observed galaxies with angular separations in
the range $(\theta, \theta + \delta\theta)$, $RR(\theta)$ is the number of random pairs in
the same range of separations, and $DR(\theta)$ is the analogous number of observed-random
cross pairs. Each of these pair counts is normalized by the total number of pairs in the
observed, random and cross-correlated observed-random samples, respectively.
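
A brute-force version of this estimator fits in a few lines. The sketch below is not the code used for the paper: it assumes a flat-sky approximation (reasonable for a 36' field), takes positions as (x, y) pairs in arcseconds, and uses placeholder uniform random positions instead of the LALA catalog.

```python
import numpy as np


def pair_separations(a, b=None):
    """Separations of all distinct pairs within `a`, or all cross pairs a-b."""
    if b is None:
        i, j = np.triu_indices(len(a), k=1)
        d = a[i] - a[j]
    else:
        d = (a[:, None, :] - b[None, :, :]).reshape(-1, 2)
    return np.hypot(d[:, 0], d[:, 1])


def landy_szalay(data, rand, bin_edges):
    """Landy & Szalay (1993) estimator with normalized pair counts."""
    n_d, n_r = len(data), len(rand)
    dd = np.histogram(pair_separations(data), bins=bin_edges)[0] / (n_d * (n_d - 1) / 2)
    rr = np.histogram(pair_separations(rand), bins=bin_edges)[0] / (n_r * (n_r - 1) / 2)
    dr = np.histogram(pair_separations(data, rand), bins=bin_edges)[0] / (n_d * n_r)
    with np.errstate(divide="ignore", invalid="ignore"):
        return (dd - 2.0 * dr + rr) / rr  # NaN where a bin has no random pairs


rng = np.random.default_rng(0)
field = 36 * 60.0                                  # field side in arcsec
data = rng.uniform(0.0, field, size=(151, 2))      # placeholder, not real data
rand = rng.uniform(0.0, field, size=(1000, 2))     # one random catalog
edges = np.array([1.0, 10.0, 40.0, 100.0, 400.0, 1600.0])
w = landy_szalay(data, rand, edges)
```

In practice the random counts would be averaged over many random catalogs generated with the true survey geometry, including masked regions, which this toy example does not model.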

Due to the small number of galaxies detected in the individual filters, we perform the
$w(\theta)$ calculations for the total sample of 151 galaxies (numbers are given in Table 1).
We are not able to resolve galaxies which are separated by less than 1 arcsecond from each
other, so we use this value as the smallest distance in the calculation of the number of
pairs. The random sample consists of 1000 individual catalogs, which have been generated to
have the same number of objects and the same geometry as the observed field. Formal errors
are estimated for every bin using the relation $(1 + w(\theta))/\sqrt{DD}$ as an
approximation of the Poisson variance, which is a very good estimate of the noise for this
$w(\theta)$ estimator (Landy & Szalay 1993). Our data show a strong correlation in the
innermost bins, but the estimated $w(\theta)$ value approaches zero rapidly at
$\theta > 40''$.

It is generally assumed that $w(\theta)$ is well represented by a power law
$A_w \theta^{-\beta}$. From the estimated $w(\theta)$ values for our data set, we conclude
that there are not enough bins with significant power for us to estimate both the amplitude
and the slope of the correlation law. For further calculations we therefore adopt the
fiducial slope $\beta = 0.8$. This value is within the range for published Lyman break
samples (see, e.g., Giavalisco et al. 1998), and is moreover consistent with results for a
flux limited sample of over $10^5$ low redshift galaxies from the SDSS (Zehavi et al. 2004).

We use the $\chi^2$ method to obtain the amplitude of the power law fitted to the estimated
$w(\theta)$ points, using the assumed slope of $\beta = 0.8$. The best-fit amplitude is
$A_w = 6.73 \pm 1.80$ for $\theta$ in arcseconds (Figure 1), obtained with $\chi^2 = 1.90$
total (weighting the points with the modeled values). The confidence interval for the
derived amplitude is estimated from Monte Carlo simulations in the following manner. We
create a set of 10000 random realizations of $w(\theta)$ values, modeling them with a power
law with the above estimated amplitude $A_w$ and slope $\beta = 0.8$ and assuming normal
errors (Press et al. 1992). For every realization of $w(\theta)$ values we obtain the
best-fit amplitude using the $\chi^2$ minimization process, fitting a power law with the
fiducial value of the slope $\beta$. The resulting distribution of the estimated amplitudes
is shown in the left panel of Figure 2.
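
With the slope fixed, the amplitude that minimizes $\chi^2$ has a closed form, which makes the Monte Carlo step cheap. The sketch below assumes Gaussian errors with known per-bin standard deviations; this is a simplification, since the weighting by modeled values mentioned above is not reproduced, and the bin centers and errors used here are invented for illustration.

```python
import numpy as np


def fit_amplitude(theta, w, sigma, beta=0.8):
    """Least-squares amplitude of w = A * theta**(-beta) at fixed slope."""
    theta, w, sigma = (np.asarray(x, dtype=float) for x in (theta, w, sigma))
    model = theta ** (-beta)
    return float(np.sum(w * model / sigma**2) / np.sum(model**2 / sigma**2))


def monte_carlo_amplitudes(theta, a_w, sigma, beta=0.8, n_real=10000, seed=0):
    """Best-fit amplitudes for synthetic power-law data with normal errors."""
    theta = np.asarray(theta, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    rng = np.random.default_rng(seed)
    model = theta ** (-beta)
    w_sim = a_w * model + rng.normal(0.0, sigma, size=(n_real, len(theta)))
    return (w_sim @ (model / sigma**2)) / np.sum(model**2 / sigma**2)


theta_bins = np.array([2.0, 5.0, 10.0, 20.0, 40.0])   # arcsec, hypothetical bins
sigma_bins = np.full(5, 0.5)                          # hypothetical errors
amps = monte_carlo_amplitudes(theta_bins, 6.73, sigma_bins)
lo, hi = np.percentile(amps, [16, 84])                # central 68% interval
```

The percentiles of `amps` then play the role of the confidence interval on $A_w$.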

Estimates of $w(\theta)$ require an estimate of the background galaxy density. We base
our density estimate on the survey itself. We therefore need to account for uncertainty in
the background density due to cosmic variance in the local number density in our survey
volume. This bias, known as the "integral constraint", reduces the value of the amplitude
of the correlation function by the amount (see e.g. Peebles 1980)

$$ C = \frac{1}{\Omega^2} \iint w(\theta)\, d\Omega_1\, d\Omega_2 . \tag{2} $$

Here $\Omega$ corresponds to the solid angle of the survey. The last integral can be
approximated with the expression (Roche et al. 2002)

$$ C = \frac{\sum RR(\theta)\, A_w \theta^{-\beta}}{\sum RR(\theta)} . \tag{3} $$

Summing over the observed field we calculate C = 0.00456. This value is small and we
neglect it in further calculations.
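
The sum over random pairs is straightforward to evaluate once the random pair counts are binned. A minimal sketch, assuming `theta` holds the bin centers and `rr` the random pair counts per bin (both names are illustrative):

```python
import numpy as np


def integral_constraint(theta, rr, a_w, beta=0.8):
    """RR-weighted mean of the power-law model A_w * theta**(-beta)."""
    theta = np.asarray(theta, dtype=float)
    rr = np.asarray(rr, dtype=float)
    return float(np.sum(rr * a_w * theta ** (-beta)) / np.sum(rr))


# Toy numbers chosen so the result is easy to check by hand:
# 32**(-0.8) = 2**(-4) = 0.0625, so C = 2 * (1 + 0.0625) / 2 = 1.0625.
c_toy = integral_constraint([1.0, 32.0], [1, 1], a_w=2.0)
```

Because the random pair counts are dominated by large separations, where the power law is small, the constraint for a real survey geometry is much smaller than in this toy case.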



3.2. The real-space correlation length $r_0$

In the previous subsection we presented the measurement of the correlation signal be- 
tween galaxies projected on the sky. If the redshift distribution of the observed galaxies 
N(z) is known, the spatial correlation function can be obtained from the angular correlation 
function using the inverse Limber transformation (Peebles 1980; Efstathiou et al. 1991). 
In the case of the power law representation of the angular correlation function, the spatial 
correlation function is also in power law form and it can be written as 



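$$ \xi(r) = (r/r_0)^{-\gamma} . \tag{4} $$

The slope $\gamma$ is related to the slope $\beta$ by $\gamma = \beta + 1$. The amplitudes
of the power law representations of the angular and spatial correlation functions are
related by the equation

$$ A_w = C\, r_0^{\gamma}\, \frac{\int_0^\infty F(z)\, D_\theta^{1-\gamma}(z)\, N(z)^2\, g(z)\, dz}{\left[ \int_0^\infty N(z)\, dz \right]^2} . \tag{5} $$

Here $D_\theta$ is the angular diameter distance,

$$ g(z) = \frac{H_0}{c} \left[ (1+z)^2 \left( 1 + \Omega_M z + \Omega_\Lambda \left[ (1+z)^{-2} - 1 \right] \right)^{1/2} \right] , \tag{6} $$

and $C$ is a numerical factor given by

$$ C = \sqrt{\pi}\, \frac{\Gamma[(\gamma - 1)/2]}{\Gamma(\gamma/2)} , \tag{7} $$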
where $\Gamma$ stands for the Gamma function. The function $F(z)$ describes the redshift
dependence of $\xi(r)$, and we take $F$ = constant given the small redshift range covered in
our survey. For the assumed cosmological model and the galaxy redshift distribution
described with a top-hat function in the redshift interval 4.37 < z < 4.57, we calculate the
correlation length $r_0$ of the Lyman-α galaxies to be $r_0 = 3.20 \pm 0.42\,h^{-1}$ Mpc.
The 1σ confidence interval is estimated using synthetic values of $A_w$ created in
simulations. The distribution of correlation lengths shows smaller scatter than the
corresponding distribution of amplitudes (Figure 2).
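
The inversion can be sketched numerically. The code below is not the calculation performed for the paper: it writes Limber's equation in comoving coordinates, with kernel $\chi^{1-\gamma}\,(H(z)/c)\,N(z)^2$ and $\chi$ the comoving distance, which is an assumption about conventions (it differs from a form written with proper distances by a factor $(1+z)^\gamma$ in $r_0^\gamma$). It also assumes a top-hat $N(z)$ and converts the amplitude from arcseconds to radians.

```python
import math

import numpy as np

C_KM_S = 299792.458                    # speed of light in km/s
ARCSEC = math.pi / (180.0 * 3600.0)    # one arcsecond in radians


def trapz(y, x):
    """Trapezoidal rule, written out to avoid NumPy version differences."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def e_of_z(z, om=0.3, ol=0.7):
    """Dimensionless Hubble rate H(z)/H0."""
    return np.sqrt(om * (1 + z) ** 3 + (1 - om - ol) * (1 + z) ** 2 + ol)


def comoving_distance(z, om=0.3, ol=0.7, n=4001):
    """Line-of-sight comoving distance in h^-1 Mpc (flat universe assumed)."""
    zz = np.linspace(0.0, z, n)
    return C_KM_S / 100.0 * trapz(1.0 / e_of_z(zz, om, ol), zz)


def limber_r0(a_w_arcsec, beta=0.8, z_lo=4.37, z_hi=4.57, om=0.3, ol=0.7, n=201):
    """Comoving correlation length in h^-1 Mpc from the angular amplitude."""
    gamma = beta + 1.0
    c_gamma = math.sqrt(math.pi) * math.gamma((gamma - 1) / 2) / math.gamma(gamma / 2)
    z = np.linspace(z_lo, z_hi, n)
    chi = np.array([comoving_distance(zi, om, ol) for zi in z])
    h_over_c = 100.0 * e_of_z(z, om, ol) / C_KM_S      # H(z)/c in h Mpc^-1
    n_z = np.ones_like(z)                              # top-hat N(z)
    kernel = trapz(n_z**2 * chi ** (1 - gamma) * h_over_c, z) / trapz(n_z, z) ** 2
    a_w_rad = a_w_arcsec * ARCSEC**beta                # amplitude for theta in rad
    return (a_w_rad / (c_gamma * kernel)) ** (1.0 / gamma)


r0 = limber_r0(6.73)
```

With the amplitude and redshift interval quoted in the text, this sketch should give a value close to the stated correlation length. Since $A_w \propto r_0^{\gamma}$, a fractional error on $A_w$ maps to a fractional error on $r_0$ that is smaller by a factor $1/\gamma$, which is consistent with the narrower distribution of correlation lengths noted above.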

The observed clustering signal may be diluted if our sample is contaminated by foreground
sources. From the spectroscopic follow-up of selected Lyman-α emitters at z ~ 4.5, the
fraction of contaminants is $f_{\rm cont} \sim 28\%$ (Dawson et al. 2004). The presence of
foreground sources can reduce $A_w$ by a maximum factor of $(1 - f_{\rm cont})^2$, assuming
no correlation between the contaminants. Under this assumption, the contamination-corrected
spatial correlation length for our sample is $r_0 = 4.61 \pm 0.60\,h^{-1}$ Mpc. The
corrected $r_0$ value corresponds to the maximum correlation length permitted for the sample
studied. All results in the following text that are based on the contamination-corrected
correlation lengths should therefore be understood as upper limits.
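
Because the angular amplitude scales as $r_0^{\gamma}$, dividing the amplitude by $(1 - f_{\rm cont})^2$ rescales the correlation length by $(1 - f_{\rm cont})^{-2/\gamma}$. A short sketch of this arithmetic, with an illustrative function name:

```python
def correct_for_contamination(r0, f_cont, gamma=1.8):
    """Maximum correlation length for a fraction f_cont of unclustered contaminants."""
    amplitude_boost = 1.0 / (1.0 - f_cont) ** 2   # correction applied to A_w
    return r0 * amplitude_boost ** (1.0 / gamma)  # A_w scales as r0**gamma


r0_corr = correct_for_contamination(3.20, 0.28)      # this sample
err_corr = correct_for_contamination(0.42, 0.28)     # same scaling for the error
print(f"{r0_corr:.2f} +/- {err_corr:.2f} h^-1 Mpc")
```

The same scaling applied to the error bar assumes that $f_{\rm cont}$ itself carries no uncertainty.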

Figure 3 shows the observed correlation length $r_0$ (in comoving units) of Lyman-α
galaxies at redshift z ~ 4.5 in our sample, together with $r_0$ values for a number of
surveys covering the redshift interval 0 < z < 5. The two points represented with circles in
Figure 3 are measurements of the correlation strength from two samples of Lyman-α galaxies.
The correlation length estimated from our sample at z ~ 4.5 (filled circle in Figure 3) is
in very good agreement with the correlation length $r_0 = 3.5 \pm 0.3\,h^{-1}$ Mpc for the
sample of Lyman-α galaxies at z = 4.86 (empty circle in Figure 3) obtained by Ouchi et al.
(2003).

A discrepancy arises when comparing the contamination-corrected correlation lengths
from these two samples, and we address it here in more detail. Ouchi et al. (2003) use
Monte Carlo simulations to assess the contamination of their sample by foreground sources.
Briefly, they generate a large number of artificial sources matched to the detected sources,
distribute them randomly into the two real broadband images and the one narrowband image,
and apply the same detection criteria as for the real sources. In this way Ouchi et al.
(2003) find that the maximum fraction of contaminants is about 40%. Correcting for this
contamination raises the correlation length to a maximum permitted value of
$6.2 \pm 0.5\,h^{-1}$ Mpc (Ouchi et al. 2003), much larger than our maximum permitted
correlation length of $4.61 \pm 0.6\,h^{-1}$ Mpc. The sample of Lyman-α emitters studied by
Ouchi et al. (2003) is peculiar, since its galaxies belong to a large-scale structure of
Lyman-α emitters discussed in detail in Shimasaku et al. (2003). Nevertheless, we believe
that the discrepancy between the contamination-corrected correlation lengths arises from the
different methods used to estimate the fraction of foreground sources. While our estimate is
based on spectroscopic follow-up, the fraction of contaminants derived in Ouchi et al. (2003)
is based purely on photometric data.

Shimasaku et al. (2003) discuss the sample of Lyman-α emitters at z = 4.86, extending the
sample presented in Ouchi et al. (2003) with additional Lyman-α emitters. These emitters are
detected in a field which partially overlaps the field of Ouchi et al. (2003) and partially
extends it in the direction of the observed overdensity of Lyman-α emitters. Shimasaku et
al. (2003) use the same criteria as Ouchi et al. (2003) to define the Lyman-α emitters,
except for the limiting magnitude of the Lyman-α candidates in the narrowband images, which
is half a magnitude lower. Shimasaku et al. (2003) include spectroscopic follow-up to test
their photometric selection (the spectroscopic sample contains 5 Lyman-α candidates). The
fraction of foreground contaminants estimated by Shimasaku et al. (2003) using both the
photometric and spectroscopic data is about 20%, half the fraction of low-z contaminants
estimated in Ouchi et al. (2003). If this updated fraction of contaminants also holds for
the sample of Lyman-α emitters discussed in Ouchi et al. (2003), the maximum permitted
correlation length of that sample would be $r_0 = 4.5 \pm 0.4\,h^{-1}$ Mpc, assuming no
correlation between the contaminants. This value is again in very good agreement with our
estimate of the maximum correlation length, $r_0 = 4.61 \pm 0.60\,h^{-1}$ Mpc, corrected
for the dilution of the sample of Lyman-α emitters by low-z galaxies.

However, one should bear in mind that the correlation properties of the sample of
Lyman-α emitters studied by Shimasaku et al. (2003) differ from the correlation properties
of the sample presented in Ouchi et al. (2003). The angular correlation function of Lyman-α
emitters at z = 4.86 is no longer well described by a power law in angular distance:
it is practically flat, taking values w ~ 1-2 at distances < 8 arcmin, except for the point
at 0.5 arcmin (Shimasaku et al. 2004). The authors attribute the constant amplitude of
the angular correlation function largely to the large-scale structure and the large
void regions in the observed field (see Figure 3 in Shimasaku et al. 2003 or the slightly
modified Figure 3 in Shimasaku et al. 2004). Moreover, Shimasaku et al. (2004) searched the
same field for Lyman-α emitters at redshift z = 4.79 (using imaging in an additional
narrowband filter) and found only weak clustering of these Lyman-α emitters on any scale.
These results indicate that there is a large cosmic variance in the clustering properties of
Lyman-α emitters on scales of ~ 35 $h^{-1}$ Mpc (Shimasaku et al. 2004).

The measured $r_0$ values of Lyman-α emitters (presented in Figure 3) are comparable to
the $r_0$ values of LBGs. More generally, the correlation lengths of sources observed at
high redshifts are about a third smaller than the $r_0$ values measured for nearby galaxies.
When corrected for contamination by low-z objects, the maximum permitted correlation
lengths of the samples studied at high redshifts are practically consistent with the value of
the correlation length at zero redshift.

Groth & Peebles (1977) proposed a theoretical model to describe the redshift evolution
of the correlation length, the so-called "ε-model". In comoving units this model has the
following form

$$ r_0(z) = r_0(z=0)\,(1 + z)^{-(3 + \epsilon - \gamma)/\gamma} . \tag{8} $$

For the fiducial slope of the correlation power law $\gamma = 1.8$, the parameter
$\epsilon = 0.8$ corresponds to the evolution of the correlation function expected in linear
perturbation theory for a Universe with $\Omega = 1$. For $\epsilon = -1.2$, the clustering
pattern is fixed. We use the normalization $r_0(z = 0) = 5.3\,h^{-1}$ Mpc to calculate the
modeled redshift evolution of the correlation length. The measurements of the correlation
length of the Lyman-α emitters do not follow the redshift evolution of the correlation
length predicted by the ε-model (short- and long-dashed lines in Figure 3). We conclude that
there is no value of $\epsilon$ for which equation 8 can fit the observed correlation
lengths measured over the full range 0 < z < 5. Similar conclusions have been presented by a
number of authors (Giavalisco et al. 1998; Connolly et al. 1998; Matarrese et al. 1997;
Moscardini et al. 1998).
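
For reference, the two model curves can be evaluated directly from the ε-model expression. A minimal sketch using the normalization and slope quoted in the text (the function name is illustrative):

```python
def epsilon_model_r0(z, epsilon, r0_local=5.3, gamma=1.8):
    """Comoving correlation length in h^-1 Mpc predicted by the epsilon-model."""
    return r0_local * (1.0 + z) ** (-(3.0 + epsilon - gamma) / gamma)


for eps in (-1.2, 0.8):
    print(eps, round(epsilon_model_r0(4.5, eps), 2))
```

For $\epsilon = -1.2$ the exponent vanishes and the comoving correlation length stays at its local value, while for $\epsilon = 0.8$ the predicted comoving $r_0$ at z = 4.5 falls below 1 $h^{-1}$ Mpc, well under the measured values.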

This implies that the population of Lyman-α galaxies at 4 < z < 5 is much more
strongly biased than the low redshift galaxy samples shown in Figure 3.

Figure 3 cannot be straightforwardly interpreted as the redshift evolution of the
correlation length, given that the correlation length of the Lyman-α emitters (and similarly
of the LBGs) does not necessarily track that of the general population of galaxies.
Typically, high redshift systems have been selected using the Lyman-break or Lyman-α
techniques, which are sensitive to galaxies actively forming stars. A proper comparison of
the correlation lengths of galaxies at low and high redshifts would require selecting the
local sample with the same criteria used to detect the high redshift sources. For example,
Moustakas & Somerville (2002) study three populations of galaxies (local giant ellipticals,
extremely red objects and LBGs) observed at three different redshifts (z ~ 0, z ~ 1.2 and
z ~ 3, respectively) with clustering lengths of similar values. The masses of the host dark
matter haloes obtained from the clustering analysis differ between the populations observed
at different epochs, implying that these populations of objects do not have the same origin.
Therefore the values of the clustering strength measured for populations of galaxies at low
and high redshifts (possibly corrected for contaminants) cannot be used to draw definite
conclusions about the evolution of the clustering properties of all galaxies. More
understanding of the evolution of galaxies can be gained by comparing the clustering
properties of haloes which can host this type of galaxy at a specific epoch.



4. Comparison with CDM 



Using the correlation length and the comoving number density estimated from the
observations of Lyman-α emitters at z ~ 4.5, we can constrain the possible masses of the host
dark matter halos of the observed population. We compute the implied 'bias' of the Lyman-α
emitters, i.e. how clustered they are relative to the underlying dark matter in our assumed
cosmology. Readers should be cautioned that there are different definitions of bias used in
the literature, and bias is also a non-trivial function of spatial scale. Quoted numerical bias
values depend on these assumptions. We define the bias as the square root of the ratio of
the galaxy and dark matter real-space correlation functions:

$$ b = (\xi_g/\xi_{\rm dm})^{1/2} , \tag{9} $$

where we have assumed that both the galaxy and dark matter correlation functions $\xi$ are
represented by a power law, with slope $\gamma_g = 1.8$ for the galaxies and
$\gamma_{\rm dm} = 1.2$ for the dark matter (as measured in N-body simulations of Jenkins et
al. 1998). We compute our bias values at a comoving spatial scale of 3.6 $h^{-1}$ Mpc, which
corresponds to an angular separation of 100 arcsec at z = 4.5, approximately the largest
scale where we obtain a robust signal in our measured correlation function, and is the same
scale used in several other recent analyses (e.g. Lee et al. 2006).

In order to predict the clustering properties of an observed galaxy population, we must
consider both (a) the expected clustering of the underlying dark matter halos at a given
redshift and in a given cosmology, and (b) the halo occupation function, or the number of
objects residing within dark halos of a given mass. This function is dependent on the survey
redshift and sample selection method. The halo occupation function (or distribution) may
be parameterized with varying levels of complexity. Here, we use a very simple formulation,
following Wechsler et al. (2001), Bullock et al. (2002), and Moustakas & Somerville (2002).
We define $N_g(M)$ to be the average number of galaxies found in a halo with mass $M$, and
parameterize this via a three-parameter function:

$$ N_g(M) = (M/M_1)^{\alpha} \quad (M > M_{\rm min}) . \tag{10} $$

The parameter $M_{\rm min}$ represents the smallest mass of a halo that can host an observed
galaxy ($N_g = 0$ for $M < M_{\rm min}$). The normalization $M_1$ is the mass of a halo that
will host, on average, one galaxy. The slope $\alpha$ describes the dependence of the number
of galaxies per halo on halo mass. Though extremely simple, this functional form has been
widely used and has been found to be a reasonably good approximation to the halo occupation
function predicted by semi-analytic models and hydrodynamic simulations (e.g. Wechsler et al.
2001; White et al. 2001).
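
The three-parameter occupation function is simple enough to state in code. This sketch only restates the definition above; the function name and the example masses are illustrative, and the value exactly at $M = M_{\rm min}$ is taken from the power law by assumption.

```python
def mean_occupation(mass, m_min, m_1, alpha):
    """Average number of galaxies in a halo of the given mass.

    Zero below m_min, a power law (mass / m_1)**alpha otherwise.
    All masses must share the same units, e.g. h^-1 solar masses.
    """
    if mass < m_min:
        return 0.0
    return (mass / m_1) ** alpha


# alpha = 0 reproduces the simplest case of exactly one galaxy per halo.
print(mean_occupation(5e11, m_min=1e11, m_1=1e12, alpha=0.0))
# A halo of mass m_1 hosts one galaxy on average for any slope.
print(mean_occupation(1e12, m_min=1e11, m_1=1e12, alpha=0.8))
```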

We compute the halo mass function using the analytic expression provided by Sheth &
Tormen (1999):

$$ \frac{dn_h}{dM} = -\frac{\bar{\rho}}{M} \frac{d\sigma}{dM} \sqrt{\frac{a\nu^2}{c}} \left[ 1 + (a\nu^2)^{-p} \right] \exp\left[ -\frac{a\nu^2}{2} \right] . \tag{11} $$

Here, the parameters a = 0.707, p = 0.30 and c = 0.163 are chosen to match the halo number
density from N-body simulations. The parameter $\nu$ is defined by $\nu = \delta_c/\sigma$,
where $\delta_c \simeq 1.686$ is the critical overdensity for the epoch of collapse and
$\sigma$ is the linear rms variance of the power spectrum on the mass scale $M$ at redshift
$z$. Sheth & Tormen (1999) also give the halo bias $b_h$ in the form

$$ b_h(M) = 1 + \frac{a\nu^2 - 1}{\delta_c} + \frac{2p/\delta_c}{1 + (a\nu^2)^p} . \tag{12} $$

Now, the integral of the halo mass function weighted by the halo occupation function
gives the comoving number density of galaxies:

$$ n_g = \int_{M_{\rm min}}^{\infty} \frac{dn_h}{dM}(M)\, N_g(M)\, dM . \tag{13} $$

Similarly, the integral of the halo bias as a function of mass weighted by the occupation
function gives the average bias for galaxies:

$$ b_g = \frac{1}{n_g} \int_{M_{\rm min}}^{\infty} \frac{dn_h}{dM}(M)\, b_h(M)\, N_g(M)\, dM . \tag{14} $$
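
The two integrals can be evaluated on a mass grid once the halo mass function and halo bias are tabulated. The sketch below assumes the Sheth & Tormen form of the halo bias with the parameter values quoted in the text; the mass function used in the example is a toy power law, not a real halo mass function, because computing $\sigma(M)$ requires a power spectrum that is outside the scope of this illustration.

```python
import numpy as np

DELTA_C = 1.686  # critical overdensity for collapse


def sheth_tormen_bias(nu, a=0.707, p=0.30):
    """Halo bias as a function of peak height nu = delta_c / sigma."""
    x = a * nu**2
    return 1.0 + (x - 1.0) / DELTA_C + (2.0 * p / DELTA_C) / (1.0 + x**p)


def galaxy_density_and_bias(mass, dn_dm, b_h, n_occ):
    """Number density and mean bias of galaxies from tabulated halo quantities.

    mass  : increasing grid starting at M_min (upper end truncates the integral)
    dn_dm : halo mass function on the grid
    b_h   : halo bias on the grid
    n_occ : mean occupation N_g(M) on the grid
    """
    def trapz(y):
        return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(mass)))

    n_g = trapz(dn_dm * n_occ)
    b_g = trapz(dn_dm * b_h * n_occ) / n_g
    return n_g, b_g


# Toy inputs for illustration only: power-law mass function, constant bias.
mass = np.logspace(11, 15, 4001)
n_g, b_g = galaxy_density_and_bias(
    mass, dn_dm=mass**-2.0, b_h=np.full_like(mass, 2.0), n_occ=np.ones_like(mass)
)
```

Inverting these relations for $M_{\rm min}$ and $M_1$ then amounts to a root search over the two parameters at fixed slope $\alpha$.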

We first consider the simplest case, in which each dark matter halo above a minimum
mass contains a single Lyman-α emitter (i.e., $N_g = 1$ for $M > M_{\rm min}$). The comoving
number density and bias values for the Lyman-α sample, both uncorrected and corrected for
contamination, are shown in Figure 4 along with the relation between number density and
average bias for dark matter halos as a function of the minimum mass. The number density
and bias values for Lyman-break galaxies (LBGs) at z ~ 3.8 (B-dropouts) and z ~ 4.9
(V-dropouts) and for three different observed magnitude limits ($z_{850}$ = 26, 26.5, 27.0)
from the recent study of Lee et al. (2006) are also shown. We recalculate the bias values
from the Lee et al. (2006) sample using our definition of bias (Equation 9); Lee et al.
(2006) define the bias using the angular correlation function. From Figure 4 it is apparent
that there is a clear trend for Lyman-α emitters to be less common than halos that are as
strongly clustered at their observed redshift. This may imply that Lyman-α is detected in
only a fraction of the halos that host the objects producing the emission. It is also
interesting that the Lyman-α emitters have similar bias values to the LBG samples at similar
redshifts, but again have much smaller number densities. This suggests a picture in which
the host halos for these two populations may have a similar distribution of masses, but in
which Lyman-α emission is seen only a fraction of the time.
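
If the duty cycle is read simply as the ratio of the number density of emitters to that of comparably biased hosts, the range quoted in the abstract follows from the rarity factor. This is a sketch of that arithmetic under that reading, not a model of the emission timescale.

```python
def duty_cycle_from_rarity(rarity_factor):
    """Fraction of comparably clustered hosts that show Lyman-alpha emission."""
    return 1.0 / rarity_factor


# Emitters are a factor of ~2-16 rarer than LBGs of similar bias.
for factor in (2.0, 16.0):
    print(f"{factor:g}x rarer -> duty cycle {100 * duty_cycle_from_rarity(factor):.0f}%")
```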

We now consider the general halo occupation function given above, and invert the
equations for $b_g$ and $n_g$ to solve for the parameters $M_{\rm min}$ and $M_1$. As noted
by Bullock et al. (2002), and exploited by several recent studies such as Lee et al. (2006),
we can only constrain the value of the halo occupation function slope $\alpha$ if we have
information on the clustering of objects on rather small angular scales. Here we do not have
this information (we have only one measurement of the correlation function on scales smaller
than 10 arcsec), so our solutions are degenerate in this parameter. We give the values of
our obtained halo occupation parameters for three values of $\alpha$ in Table 2:
$\alpha = 0$ (one galaxy per halo), $\alpha = 0.5$, and $\alpha = 0.8$. We note that Bullock
et al. (2002) found a best-fit value of $\alpha = 0.8$ for LBGs at z ~ 3, while Lee et al.
(2006) found best-fit values of $\alpha = 0.65$ and $\alpha = 0.8$ for z ~ 3.8 (B-dropout)
and z ~ 4.9 (V-dropout) LBGs, respectively.

We see from Table 2 that the minimum host halo masses range over
~ 1.6-4 $\times 10^{10}\,h^{-1} M_\odot$ using the uncorrected values of number density and
bias, and take larger values, ~ 1.3-2.5 $\times 10^{11}\,h^{-1} M_\odot$, when we correct
for possible contamination of our sample by foreground objects. In general, $M_1$ is much
larger than $M_{\rm min}$, again reflecting that the Lyman-α emitters' number densities are
low relative to the halos that cluster strongly enough to host them.

5. Discussion and Conclusions 

We have estimated the correlation properties of Lyman-α emitters from the LALA
sample at z ~ 4.5. From the observed data we measure the amplitude of the angular two-point
correlation function $A_w = 6.73 \pm 1.80$, assuming a fiducial value of the slope of the
modeled power law, $\beta = 0.8$. Using the inverse Limber transformation for the given
cosmology and the top-hat redshift distribution of the analyzed galaxies in the interval
4.37 < z < 4.57, we calculate the spatial correlation length to be
$r_0 = 3.20 \pm 0.42\,h^{-1}$ Mpc. After correcting for the possible contamination of the
sample by uncorrelated sources (assuming a contaminant fraction of 28% based on
spectroscopic surveys), we obtain $r_0 = 4.61 \pm 0.60\,h^{-1}$ Mpc. This is the maximum
permitted value of the correlation length for our sample.

While large-scale structure in the form of voids and filaments is seen in Lyman-α emitters
(Campos et al. 1999; Møller & Fynbo 2001; Ouchi et al. 2005; Venemans et al. 2002; Palunas
et al. 2000; Steidel et al. 2000), the measurement of the correlation function is finely balanced
between detection (this paper and Ouchi et al. 2003) and non-detection (Shimasaku et al.
2004). Similar to this work, Murayama et al. (2007) measure weak clustering of Lyman-α
emitters on small scales (less than 100 arcsec), which can be well fitted by a power law.
Ouchi et al. (2004) find correlation at scales of $\theta > 50$ arcsec in a field where they see a
well-defined clump of Lyman-α emitters. The distribution of Lyman-α emitters from the survey
of Palunas et al. (2004), targeted on a known cluster at $z = 2.38$, shows a weak correlation
(a significant excess of close pairs with separations of less than 1 arcmin) and an excess of large
voids (sizes of 6-8 arcmin). Our detection is at a smaller scale ($\theta < 50$ arcsec) in a field
with no noticeable clumping. The spatial correlation length we derive agrees within the $1\sigma$
error with the estimate at $z = 4.86$ by Ouchi et al. (2003), who measured $r_0 = 3.5 \pm 0.3\,h^{-1}$
Mpc. On the other hand, the maximum permitted $r_0$ value of Lyman-α emitters in our
sample is significantly lower than the maximum permitted value of $6.2 \pm 0.5\,h^{-1}$ Mpc
estimated by Ouchi et al. (2003). The 40% fraction of low-$z$ contaminants in that work
was derived using only photometric data. Shimasaku et al. (2004) included spectroscopic
follow-up data for the enlarged field observed by Ouchi et al. (2003), and
derived a lower contaminant fraction of 20%. Using this value for the contamination by
low-$z$ galaxies, the maximum permitted correlation length discussed in Ouchi et al. (2003)
would be $r_0 = 4.5 \pm 0.4\,h^{-1}$ Mpc, assuming no correlation between the contaminants. This
fraction of low-$z$ contaminants brings our maximum permitted correlation length and that of
Ouchi et al. (2003) back into agreement.
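
The rescaling of the Ouchi et al. (2003) value between the two contaminant fractions can be written out as a worked equation. This is a sketch that assumes uncorrelated contaminants dilute the angular amplitude by $(1 - f_c)^2$ and that $r_0 \propto A_w^{1/\gamma}$ with $\gamma = 1.8$, the slope adopted in this paper; the slope used in the original analysis may differ.

$$
r_0^{\rm corr}(f_c) = r_0^{\rm obs}\,(1 - f_c)^{-2/\gamma},
\qquad
r_0^{\rm corr}(0.20) = r_0^{\rm corr}(0.40)\left(\frac{1 - 0.40}{1 - 0.20}\right)^{2/\gamma}
= 6.2 \times 0.75^{1.11} \approx 4.5\,h^{-1}\ {\rm Mpc}.
$$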

The $r_0$ values of Lyman-α emitters measured at high redshifts are about 2/3 of the
measured spatial correlation length of galaxies in the nearby Universe, or almost equal to it when
the contamination-corrected correlation lengths of the Lyman-α emitters discussed above are used.
The high-redshift values, however, are measured for specifically selected samples of galaxies,
whereas the low-redshift values refer to more general populations of galaxies. Correlation
lengths at high redshift that are as large as those at low redshift therefore do not imply
an absence of evolution in the correlation length.

We compare the measured clustering values with the expected clustering of dark matter
and dark matter halos in the CDM paradigm. We find that the Lyman-α emitters are
strongly biased, $b \sim 2.5$-$3.7$, relative to the dark matter on scales of $3.6\,h^{-1}$ Mpc at $z = 4.5$.
These bias values imply that the Lyman-α emitters must reside in halos with minimum
masses of $1.6$-$4 \times 10^{10}\,h^{-1}\,M_\odot$ (uncorrected) or $\sim 1.3$-$2.5 \times 10^{11}\,h^{-1}\,M_\odot$ using the results
after correction for contamination. Interestingly, the observed number density of Lyman-α
emitters is a factor of $\sim 2$-$16$ lower than that of dark matter halos that cluster strongly
enough to host them. We further note that the observed bias of Lyman-α emitters is
similar to that of Lyman-break galaxies at $z \sim 3.8$ and $z \sim 4.9$, but again, the number
density of the Lyman-α emitters is much lower. This suggests a picture in which the parent
population of Lyman-α emitters may occupy dark matter halos with a similar distribution
of masses to those that host LBGs, but is detectable in Lyman-α with a finite duty cycle
in the range of 6 to 50%.
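
The quoted duty-cycle range follows from the number-density ratio, if one assumes (as a simplification made here) that the duty cycle is the fraction of eligible halos hosting a detectable emitter at any given time. The symbols $f_{\rm duty}$, $n_{\rm LAE}$, and $n_{\rm halo}$ are introduced only for this illustration:

$$
f_{\rm duty} \simeq \frac{n_{\rm LAE}}{n_{\rm halo}},
\qquad
\frac{1}{16} \approx 0.06 \;\lesssim\; f_{\rm duty} \;\lesssim\; \frac{1}{2} = 0.5 .
$$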

Malhotra & Rhoads (2002) estimated this duty cycle by combining stellar population
modelling with the extrapolated luminosity function of LBGs at $z = 4$ (Pozzetti et al. 1998;
Steidel et al. 1999). The Lyman-α emitters were modeled with different stellar population
models to match the observed EW distribution. To match the number density of Lyman-α
emitters, only a small fraction of the faint objects inferred from the LBG
luminosity function needs to be active in Lyman-α emission. This fraction is derived to be
7.5%-15%, depending on the stellar population model, with the lower number corresponding
to a zero-metallicity stellar population with an IMF slope of $\alpha = 2.35$, whose spectrum at
an age of $10^6$ yr was derived by Tumlinson & Shull (2000). This is consistent with the
range of allowed duty cycles inferred from the clustering analysis presented here. However,
the field-to-field variance in the number density of Lyman-α emitters is large, and analysis
of more fields is needed before we can pin this value down further. Measurement of the
correlation of Lyman-α emitters on smaller angular scales would allow us to better constrain
the parameters of the halo occupation function, in particular its mass dependence $\alpha$.

This work made use of images provided by the NOAO Deep Wide-Field Survey (Jannuzi 
and Dey 1999), which is supported by the National Optical Astronomy Observatory (NOAO). 
NOAO is operated by AURA, Inc., under a cooperative agreement with the National Science 
Foundation. STScI is operated by the Association of Universities for Research in Astronomy, 
Inc., under NASA contract NAS5-26555. We thank Alex S. Szalay, Mauro Giavalisco and 
Tamas Budavari for useful discussions, and the latter also for the help with the inverse 
Limber transformation calculation. K.K. would like to thank STScI for hospitality during
the course of this work.
Table 1. Sample statistics

Filter         Numbers    Surface density (arcsec$^{-2}$)

All filters    151        $3.51 \times 10^{-5}$
H0             31         $7.20 \times 10^{-6}$
H4             39         $9.06 \times 10^{-6}$
H8             38         $8.83 \times 10^{-6}$
H12            66         $1.53 \times 10^{-5}$
H16            31         $7.20 \times 10^{-6}$


Table 2. Correlation statistics parameters

                 Measured values                                      Halo occupation function parameters
Type of data     $r_0$ [$h^{-1}$ Mpc]   $n$ [$h^3$ Mpc$^{-3}$]   $b$    $\alpha$   $\log(M_{\rm min}/h^{-1}M_\odot)$   $\log(M_1/h^{-1}M_\odot)$

Observed         3.20                   $6.0 \times 10^{-4}$     2.6               10.6351
                                                                        0.5        10.44                               14.76
                                                                        0.8        10.20                               13.50

Corrected for    4.61                   $4.3 \times 10^{-4}$     3.7               11.40
contamination                                                           0.5        11.25                               13.58
                                                                        0.8        11.14                               12.97
Fig. 1. The angular correlation function for the sample of 151 Lyman-α emitters at $z \approx 4.5$.
The solid line is the best fit of the modeled power law $w(\theta) = A_w \theta^{-0.8}$.
Fig. 2. Left panel: Histogram of the best-fit amplitude $A_w$ from the Monte Carlo simulation.
Right panel: Histogram of the spatial correlation length $r_0$, calculated via the Limber
equation from the simulated amplitudes whose distribution is shown in the left panel.
Fig. 3. Comparison of the correlation length of the Lyman-α emitters from this work
with correlation lengths of other galaxy populations from the literature. The filled circle
represents our measurement. The empty circle is the correlation length $r_0$ of Lyman-α
emitters at $z = 4.86$ from Ouchi et al. (2003). Triangles indicate correlation properties of
LBGs. The open triangles show measurements for LBGs at $z = 3$ determined by Adelberger
(2000) and at $z = 4$ determined by Ouchi et al. (2004). The last point is for a sample of
the selected LBGs with $i' < 26.0$. The filled triangles are $r_0$ values by Lee et al. (2005)
calculated when both $\beta$ and $A_w$ were allowed to vary. The point at $z = 3.8$ is the $r_0$ value
for B-dropouts and the point at $z = 4.9$ is the corresponding value for V-dropouts, both
with the magnitude limit $z_{850} = 27$. The low- and intermediate-redshift measurements of
$r_0$ are represented by an empty star (Loveday et al. 1995; data from the Stromlo-APM redshift
survey), a filled square (Zehavi et al. 2002; SDSS galaxies), an empty square (Hawkins et al.
2003; 2dFGRS galaxies), hexagons (Carlberg et al. 2000; data from the Canadian Network
for Observational Cosmology field galaxy redshift survey) and crosses (Brunner, Szalay, &
Connolly 2000; data from a field located at 14:20, +52:30, covering approximately 0.054 deg$^2$,
with photometrically measured redshifts). The dashed lines are $r_0$ values as predicted by
the "$\epsilon$-model" at different redshifts: the short-dashed line corresponds to $\epsilon = 0.8$
and the long-dashed line corresponds to $\epsilon = 0$. For comparison, the solid line shows
the redshift evolution of the spatial correlation length of dark matter given by equation A1
in Moustakas & Somerville (2002). With the bias defined by equation [9], we conclude that
high-redshift galaxies are more strongly biased than the galaxies from nearby samples and
samples at intermediate redshifts.
Fig. 4. Bias vs. comoving number density for our observed sample of Lyman-α
emitters (open circle: uncorrected; solid circle: corrected for contamination), as well as for
dark matter halos at $z = 4.5$ (solid line). Also shown are number density and bias values for
Lyman-break galaxies at $z = 3.8$ (B-dropouts; squares) and $z = 4.9$ (V-dropouts; triangles)
for three different magnitude limits ($z_{850} = 26$, 26.5, and 27 from lowest to highest number
density) from Lee et al. (2006). The dashed lines show the relations for dark matter halos
at $z = 3.8$ (lower curve) and $z = 4.9$ (upper curve) for comparison with the LBGs. The
Lyman-α emitters are less numerous than either dark matter halos or LBGs with similar
bias values.



